Abstract

It is shown that, in a precise sense, if there is no bound on the number of faulty
processes in a system with unreliable but fair communication, Uniform Distributed
Coordination (UDC) can be attained if and only if a system has perfect failure 
detectors. This result is generalized to the case where there is a bound t on the 
number of faulty processes. It is shown that a certain type of generalized failure 
detector is necessary and sufficient for achieving UDC in a context with at most t 
faulty processes. Reasoning about processes' knowledge as to which other processes 
are faulty plays a key role in the analysis.

1 Introduction

Periodically coordinating specific actions among a group of processes is fundamental 
to solving most distributed computing problems, and especially to replication schemes 
that achieve fault tolerance. Unfortunately, as is well known, it is impossible to achieve 
coordination in an asynchronous setting even if there can be only one faulty process 
[FLP85]. This is true even if communication is reliable. As a result, there has been a great 
deal of interest recently in systems with failure detectors [CT96], oracles that provide 
suspicions as to which processes in the system are faulty. This interest is heightened by 
results of Chandra, Hadzilacos, and Toueg [CT96, CHT96] showing that consensus can 
be achieved with relatively unreliable failure detectors.

Here we consider what kind of failure detectors are necessary to attain Uniform
Distributed Coordination (UDC) [GT89]. We have UDC of action a if, whenever some
process (correct or not) performs a, then so do all the correct processes. There are two 
features that distinguish UDC from consensus. First, if a process that initiates an action 
is later found to be faulty, in the UDC setting, all the processes must still perform the 
action. On the other hand, in the case of consensus, the nonfaulty processes can agree 
not to perform the action. This property of UDC is particularly important in practice. 
Consider, for example, a group of processes implementing a fault-tolerant service; actions
are executed on behalf of clients and change the state of the service (for example, allocating
a scarce resource). In the UDC setting, the service cannot repudiate an action should
the member that initiated it eventually be deemed faulty, as could be the case in consensus. With UDC,
the service is required to make that action part of the service's communal history. From
the client's point of view, the eventual designation of a group member as faulty is irrelevant;
indeed, one goal of using replication to implement a service is to mask failures from
clients. A second difference between UDC and consensus is that, in consensus, processes
must typically choose exactly one out of two actions ("attack" or "retreat"; or "decide
0" or "decide 1"). On the other hand, in UDC, there is no choice to be made; that is,
UDC has no requirement that if action a is ever taken, then of necessity, action b is never
taken. Thus, UDC suffices whenever the actions to be taken by a group can be partitioned
into non-conflicting subsets; it requires consensus to decide which of a conflicting set of
actions to take.

If we have reliable communication, then it is easy to see that we can attain UDC no 
matter how many processes may fail. Thus, in this setting, UDC is strictly easier than 
consensus. Intuitively, consensus requires all the correct processes to agree on a particular 
action. For example, they must all agree to attack or all agree to retreat (but cannot 
do both). With UDC, if one process attacks, all the correct processes must attack, and 
if one retreats, all must retreat. But it is perfectly consistent with UDC for the correct 
processes both to attack and to retreat. 

If communication is unreliable but fair, then we show that we can attain UDC even 
if there is no bound on the number of process failures (that is, even if there are runs in 
which all processes may fail) in the presence of weak failure detectors, which have the
property that eventually each faulty process is permanently suspected by at least one
correct process (weak completeness) and at least one correct process is never suspected 
(weak accuracy). Chandra and Toueg [CT96] showed that consensus with an arbitrary 
number of failures is also achievable using weak failure detectors. They considered a 
setting with reliable communication, but their results apply with essentially no change 
to a setting where communication is unreliable but fair in an appropriate sense. 

Chandra and Toueg observed that by having processes communicate their suspicions, 
a weak failure detector can be converted to a strong failure detector, which satisfies 
weak accuracy and strong completeness (all correct processes eventually permanently 
suspect every faulty process). We further show that, under an assumption about the
independence of process failures, in systems with no bound on the number of faulty processes,
strong failure detectors are equivalent to perfect failure detectors, which satisfy
strong completeness and strong accuracy, meaning that no process is suspected until it crashes.
(Indeed, under the same conditions, weak failure detectors are equivalent to perfect failure
detectors.)

These results tell us that, if there is no bound on failures, then we can attain UDC 
using what are effectively equivalent to perfect failure detectors. Are perfect failure 
detectors really necessary? We show that in a precise sense they are. Under quite minimal 
assumptions, perfect failure detectors can be implemented in a system that attains UDC 
with no bounds on the number of failures. 1 It is interesting to note that Schiper and 
Sandoz' Uniform Reliable Multicast [SS93] is a special case of UDC where the only action 
of interest is reliable message delivery. Schiper and Sandoz implement Uniform Reliable 
Multicast by using the Isis virtual synchrony model [BJ87], which simulates perfect failure 
detection. Our results support their need to implement it in this way. 

What happens if there is a bound on the number of faulty processes? Gopal and Toueg 
[GT89] show that UDC is achievable with no failure detectors in systems where fewer 
than half the processes can fail. Here we generalize these results, providing, for each value 
of t, a generalized failure detector that we can show is necessary and sufficient to attain 
UDC if there are at most t failures. The generalized failure detector we consider reports 
suspicions of the form "at least k processes in a set S of processes are faulty" (although 
it does not specify which k are the faulty ones). Such generalized failure detectors may 
be appropriate when the system can be viewed as consisting of a number of components, 
and all we can say is that some process in a component is faulty, without being able to 
say which one it is. 

The rest of this paper is organized as follows. In Section 2, we provide the necessary
background, reviewing the formal model, failure detectors, the formal language, and the
definition of UDC. In Section 3, we present our analysis in the case that there is no bound
on the number of faulty processes. Our proof techniques may be of independent interest,
since they make nontrivial use of the knowledge-theoretic tools of Fagin et al. [FHMV95].
Reasoning about the knowledge of the processes in the system (particularly, their knowledge
of which other agents are faulty) plays a key role in the analysis. In Section 4, we
extend this analysis to the case where there is a known bound t on the number of faulty
processes; we also introduce our generalized failure detectors. We conclude in Section 5
with a discussion of the results and a comparison of our results to results of Aguilera,
Toueg, and Deianov [ATD99] who, in response to the conference version of this paper
[HR99], provided an alternative characterization of the type of failure detectors needed
to attain UDC. Proofs are relegated to the Appendix.

1 We remark that our notion of "implement" is stronger than the notion of reduction used by Chandra,
Hadzilacos, and Toueg [CT96, CHT96]; see Section 3.

2 Background 

In this section, we briefly discuss the formal model (and, in particular, our assumptions 
about message delivery), failure detectors, the formal language that we use for expressing 
coordination, which includes operators for knowledge and time, and the notion of UDC. 

2.1 The Model 

We adopt the familiar model of an asynchronous distributed system. We assume that 
there is a fixed finite set $\mathit{Proc} = \{p_1, \ldots, p_n\}$ of processes with no shared global clock.
These processes communicate with one another by passing messages over a completely
connected network of channels. Processes fail by crashing and do not recover, but otherwise
follow their assigned protocols. Channels are not reliable. A message that is sent is
not necessarily received and, even if it is received, there is no upper bound on message 
transmission delay. However, channels do not corrupt messages (so that every message 
received is one that was actually sent) and they are fair, in the sense that if the same 
message is sent from p to q infinitely often and q does not crash, then the message is 
eventually received infinitely often by q. 

Processes and the environment (or nature) execute actions; corresponding to each 
action is an event (intuitively, the event of that action occurring). We assume that the 
events that take place at a particular process are totally ordered, and are recorded in 
that process's history. The events recorded in p's history include communication events
of the form $\mathit{send}_p(q, \mathit{msg})$ (p sends message msg to q) and $\mathit{recv}_p(q, \mathit{msg})$ (p receives msg
from q); internal events, which include events of the form $\mathit{do}_p(a)$ (p executes action a)
and $\mathit{init}_p(a)$ (p initiates a; see Section 2.4); the special event $\mathit{crash}_p$, which models the
failure of p; and failure-detector events, which are discussed in Section 2.2.

A history for process p, denoted $h_p$, is a sequence of events corresponding to actions
performed by process p. A cut is a tuple of finite process histories, one for each $p \in \mathit{Proc}$.
A run is a function from time (which we take to range over the natural numbers, for
simplicity) to cuts. If r is a run, we use $r_p(m)$ to denote p's history in the cut $r(m)$. A
pair $(r, m)$ consisting of a run r and a time m is called a point. We write $(r, m) \sim_p (r', m')$
if $r_p(m) = r'_p(m')$. We say that a run $r'$ extends a point $(r, m)$ if $r'(m') = r(m')$ for all
$m' \le m$. Thus, $r'$ extends $(r, m)$ if r and $r'$ have the same prefix up to time m. Process
q is faulty in run r iff $\mathit{crash}_q$ is in q's history. $F(r)$ denotes the set of faulty processes in run r.

We assume that a run r satisfies the following. 

R1. $r(0) = (\langle\,\rangle, \ldots, \langle\,\rangle)$ (that is, at time 0, each process's history is empty).

R2. $r_p(m+1) = r_p(m)$ or $r_p(m+1)$ is the result of appending one event to $r_p(m)$.

R3. If $\mathit{recv}_q(p, \mathit{msg})$ is in $r_q(m)$, then the corresponding send event $\mathit{send}_p(q, \mathit{msg})$ is in
$r_p(m)$.

R4. If the event $\mathit{crash}_p$ is in $r_p(m)$, then it is the last event in $r_p(m)$.

R5. If the number of occurrences of $\mathit{send}_p(q, \mathit{msg})$ in $r_p(m)$ grows unboundedly as m
increases, then either the event $\mathit{crash}_q$ appears in $r_q(m)$ for some m or the number
of occurrences of $\mathit{recv}_q(p, \mathit{msg})$ in $r_q(m)$ grows unboundedly as m increases. (Informally,
if in run r process p sends msg infinitely often to q, then either q crashes or
q receives msg infinitely often.)
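
As an illustration only, the safety conditions R1-R4 can be checked mechanically on a finite prefix of a run. The sketch below is not part of the formal model: the tuple encoding of events (`("send", sender, receiver, msg)`, `("recv", receiver, sender, msg)`, `("crash", p)`) and the function name `check_prefix` are assumptions made for the example. R5 is a fairness (liveness) condition on infinite runs, so it cannot be decided from a finite prefix and is deliberately left out.

```python
from typing import Dict, List, Tuple

Event = Tuple                        # assumed encoding, see the lead-in above
Cut = Dict[str, Tuple[Event, ...]]   # process name -> finite history


def check_prefix(run: List[Cut]) -> List[str]:
    """Return the names of the conditions R1-R4 violated by a finite run prefix."""
    violated = set()
    # R1: every history is empty at time 0.
    if any(len(hist) != 0 for hist in run[0].values()):
        violated.add("R1")
    for m, cut in enumerate(run):
        for p, hist in cut.items():
            # R2: each step leaves the history unchanged or appends one event.
            if m + 1 < len(run):
                nxt = run[m + 1][p]
                appended_one = len(nxt) == len(hist) + 1 and nxt[:len(hist)] == hist
                if nxt != hist and not appended_one:
                    violated.add("R2")
            for i, event in enumerate(hist):
                # R3: a receive needs a matching send in the sender's history.
                if event[0] == "recv":
                    _, receiver, sender, msg = event
                    if ("send", sender, receiver, msg) not in cut[sender]:
                        violated.add("R3")
                # R4: a crash event must be the last event in the history.
                if event[0] == "crash" and i != len(hist) - 1:
                    violated.add("R4")
    return sorted(violated)
```

A prefix in which q receives a message that p has not yet sent, or in which p keeps acting after crashing, is reported as violating R3 or R4 respectively.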

When we consider failure detectors, we add further conditions to runs. 

A system is a set of runs. Systems are typically generated by protocols executed in 
a certain context. Formally, a protocol for process p is a function from finite histories to 
actions. A joint protocol is a tuple $P = (P_1, \ldots, P_n)$ consisting of a protocol for each process
in $\mathit{Proc}$. A run r is consistent with a joint protocol P if, for all processes $p_i$ and times m, if
$r_{p_i}(m+1) = r_{p_i}(m) \cdot e$ and e is an event corresponding to a protocol action, then e is in fact the
event corresponding to the action $P_i(r_{p_i}(m))$. A context for us is simply a bound on the
number of processes that can fail (if there is such a bound), a specification of properties
of failure detectors (see Section 2.2), and a specification of communication properties
(whether communication is reliable, fair, etc.). (Fagin et al. [FHMV95, FHMV97] give a
more general definition of context, but this suffices for our purposes.) In a given context,
a joint protocol generates the system consisting of all the runs satisfying R1-R5 and the
constraints of the context that are consistent with the protocol. We say that a joint
protocol has a certain property in a given context if the system it generates in that
context has that property. Note that, because all runs in the systems we consider are
assumed to satisfy R5, we are restricting attention in this paper to systems where communication is
fair, although possibly unreliable.

2.2 Failure Detectors 

Informally, a failure detector [CT96] is a per-process oracle that emits suspicions regarding
other processes' faultiness. The fact that a process q is suspected by process p's
failure detector does not mean that q is in fact faulty. Various failure detectors can be
defined by imposing conditions on the accuracy and completeness of suspicions.

Chandra and Toueg [CT96] model failure detectors by assuming a function H such
that $H(p, t)$ describes the suspicions of p's failure detector at time t. Chandra and
Toueg then assume that processes explicitly query their failure detectors to "learn" those
suspicions. Our approach is slightly more general. We model the act of p getting a report
x from its failure detector by the event $\mathit{suspect}_p(x)$. A process p could "get a report
from its failure detector" either because it explicitly reads it (as Chandra and Toueg
assume) or because the failure detector automatically emits a suspicion. A standard
report is one of the form "the processes in S are faulty", which we model by the report
$\mathit{suspect}_p(S)$. A standard failure detector is one whose reports are standard. In a system
with standard failure detectors, for each process p and point $(r, m)$, define $\mathit{Suspects}_p(r, m) = S$ if and only if
$\mathit{suspect}_p(S)$ is the most recent failure-detector event in $r_p(m)$. (If there have not been any
reports by time m in r, $\mathit{Suspects}_p(r, m) = \emptyset$.) We will shortly generalize the definition of
$\mathit{Suspects}_p(r, m)$ so that it applies in the presence of (some) nonstandard failure detectors.
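
The definition of the current suspicion set for a standard failure detector can be read off a history directly. The following is a minimal sketch under an assumed encoding in which a failure-detector event is the tuple `("suspect", S)` and every other event is some other tuple; the function name `suspects` is hypothetical.

```python
from typing import FrozenSet, Sequence, Tuple

Event = Tuple  # assumed encoding: ("suspect", set_of_names) or any other tuple


def suspects(history: Sequence[Event]) -> FrozenSet[str]:
    """Set reported by the most recent failure-detector event, or empty if none."""
    for event in reversed(history):
        if event[0] == "suspect":
            return frozenset(event[1])
    return frozenset()
```

Scanning from the end of the history reflects the phrase "most recent failure-detector event"; a history with no report yields the empty set.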

The differences between our way of modeling failure detectors and the Chandra-Toueg
approach are mainly cosmetic. In the Chandra-Toueg approach, what we are calling a run
consists of the actions performed by the processes (including reading the failure detector)
and a special tape or "oracle" that describes the responses when the failure detector is
read. We have used our approach so as to be able to capture all the behavior of the system
in terms of histories, without invoking any extra structure (such as extra tapes). It is easy
to translate from runs in the Chandra-Toueg framework to runs in our framework, and
vice versa. Given a run in the Chandra-Toueg framework, the corresponding run in our
framework uses the event $\mathit{suspect}_p(x)$ to indicate both that p read its special tape and
that the response was x. Conversely, given a run in our framework, the corresponding
run in the Chandra-Toueg framework has p query its failure detector and receive response
x at exactly the points where the event $\mathit{suspect}_p(x)$ appears in its history.

Although there is a one-to-one mapping between runs in our framework and runs in
the Chandra-Toueg framework, the systems (i.e., sets of runs) that we allow are slightly
more general than the systems they consider. Chandra and Toueg essentially consider
only systems that are a cross-product of the set of possible special tapes and the set of
possible actions performed by the processes. That is, no correlation is allowed between
the behavior of the processes and the behavior of the failure detector. We do allow
correlation, and thus can consider types of failure detectors that Chandra and Toueg
cannot (see below). However, it would be easy to extend the Chandra-Toueg framework
to allow such correlation.

Consider the following properties of standard failure detectors (the first four are also 
used by Chandra and Toueg): 

Strong Accuracy: No process is suspected before it crashes. Formally, for all processes p
and q and times m, if $q \in \mathit{Suspects}_p(r, m)$, then $\mathit{crash}_q$ is in $r_q(m)$.

Weak Accuracy: If there is a correct process, then some correct process is never suspected.
Formally, if $F(r) \ne \mathit{Proc}$, then there is some $q \notin F(r)$ such that, for all processes p
and times m, $q \notin \mathit{Suspects}_p(r, m)$.

Strong Completeness: All faulty processes are eventually permanently suspected by all
correct processes. Formally, if $q \in F(r)$ and $p \notin F(r)$, then there is a time m such
that for all $m' \ge m$, $q \in \mathit{Suspects}_p(r, m')$.

Weak Completeness: Each faulty process is eventually permanently suspected by some
correct process. Formally, if $q \in F(r)$ and $F(r) \ne \mathit{Proc}$, then there exists some
$p \notin F(r)$ and a time m such that, for all $m' \ge m$, $q \in \mathit{Suspects}_p(r, m')$. 2

Impermanent Strong Completeness: All faulty processes are eventually suspected (but
not necessarily permanently) by all correct processes. Formally, if $q \in F(r)$ and
$p \notin F(r)$, then there is some time m such that $q \in \mathit{Suspects}_p(r, m)$.

Impermanent Weak Completeness: Each faulty process is eventually suspected (but not
necessarily permanently) by some correct process. Formally, if $q \in F(r)$ and $F(r) \ne \mathit{Proc}$,
then there is some $p \notin F(r)$ and time m such that $q \in \mathit{Suspects}_p(r, m)$.

A system $\mathcal{R}$ is said to satisfy a given property of failure detectors (e.g., weak completeness
or impermanent strong completeness) if the failure detectors in every run of $\mathcal{R}$ satisfy
the property.
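
To make the two accuracy conditions concrete, here is a small sketch that tests them on a finite prefix of a single run. The data layout is an assumption for the example: `suspects[m][p]` is the set p suspects at time m, and `crash_time[q]` is the first time at which q's crash event is in its history (`None` if q is correct). Because the prefix is finite, the weak-accuracy check is only an approximation of the definition, which quantifies over all times; the completeness properties are liveness conditions and are not checked here.

```python
from typing import Dict, List, Optional, Set


def strong_accuracy(suspects: List[Dict[str, Set[str]]],
                    crash_time: Dict[str, Optional[int]]) -> bool:
    """True if nobody is suspected at a time m before its crash is recorded."""
    for m, cut in enumerate(suspects):
        for suspected in cut.values():
            for q in suspected:
                t = crash_time[q]
                if t is None or t > m:
                    return False
    return True


def weak_accuracy(suspects: List[Dict[str, Set[str]]],
                  crash_time: Dict[str, Optional[int]]) -> bool:
    """True if all processes crash, or some correct process is never suspected."""
    correct = {q for q, t in crash_time.items() if t is None}
    if not correct:
        return True
    ever_suspected = set().union(*(s for cut in suspects for s in cut.values()))
    return bool(correct - ever_suspected)
```

A run in which p suspects q one step before q actually crashes fails strong accuracy but can still satisfy weak accuracy, as long as some correct process is never suspected.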

We remark that impermanent strong and weak completeness cannot meaningfully be
captured in the Chandra-Toueg framework because, as we mentioned earlier, Chandra
and Toueg do not allow correlation between the behavior of processes and the behavior
of the failure detector. The only way that a special tape can guarantee impermanent
strong completeness is to ensure that eventually, whenever the tape is consulted, it will
report a failure. (Otherwise it might be consulted only at times when it does not report
a failure.) Impermanent strong completeness requires a correlation between the special
tape and the actions of the processes of a sort not allowed by Chandra and Toueg.

Chandra and Toueg define a perfect failure detector as one that satisfies strong completeness
and strong accuracy, a strong failure detector as one that satisfies strong completeness
and weak accuracy, and a weak failure detector as one that satisfies weak completeness
and weak accuracy. We define an impermanent-strong failure detector as one
that satisfies impermanent strong completeness and weak accuracy and an impermanent-weak
failure detector as one that satisfies impermanent weak completeness and weak
accuracy.

The definitions above have focused on standard failure detectors, whose reports have
the form "the processes in S are faulty". However, other types of reports can also be
used, as long as they can be viewed as saying that the processes in some set S are faulty.
For example, a report of the form "the processes in $\mathit{Proc} - S$ are correct" can clearly be
viewed as saying that the processes in S are faulty. To make this precise, we say that a failure
detector is g-standard if g is a function mapping the reports of the failure detector to
subsets of $\mathit{Proc}$. Thus, the failure detector that reports that the processes in $\mathit{Proc} - S$
are correct (such failure detectors are used in [ATD99], for example) is g-standard, where
$g(x) = S$ for each report x of the form "the processes in $\mathit{Proc} - S$ are correct". If p has a
g-standard failure detector, define $\mathit{Suspects}_p(r, m) = S$ if and only if $\mathit{suspect}_p(x)$ is the
most recent failure-detector event in $r_p(m)$ and $g(x) = S$. Notions of strong accuracy, weak accuracy, and so on
now apply to g-standard failure detectors with no change in the definition. Although
we consider only standard failure detectors in this paper, all of our results apply to
g-standard failure detectors as well.

2 Chandra and Toueg do not require that $F(r) \ne \mathit{Proc}$ in their definition of weak accuracy or weak
completeness, since they assume that there always is at least one correct process. We have added it here
since we allow runs where all processes fail.

Chandra and Toueg show that a failure detector satisfying weak completeness can
be converted to one satisfying strong completeness, while still preserving accuracy properties.
Roughly speaking, all processes just communicate and tell each other about the
suspicions reported by their original failure detectors; their modified failure detector reports
all the suspicions they hear about. The same construction can be used to convert
a failure detector satisfying impermanent weak completeness to one satisfying impermanent
strong completeness.

We need to be a little careful in making precise in our framework the notion of
converting one type of failure detector to another. In the simplest case, given a system
$\mathcal{R}$, it is simply a question of considering a system $\mathcal{R}'$ where each run $r \in \mathcal{R}$ is replaced by
a run $r' \in \mathcal{R}'$ such that each occurrence of an event of the form $\mathit{suspect}_p(x)$ is replaced
by a different failure-detector event $\mathit{suspect}'_p(x')$, reflecting the modified failure detector.
However, if the conversion also involves additional communication (as in the conversion
from weak completeness to strong completeness), the naive replacement of failure-detector
events does not suffice. Nevertheless, the basic notion of conversion remains the same.
Namely, we assume that there is some function f mapping runs to runs such that all
the events in r (except possibly the failure-detector events) appear as events in $f(r)$,
and appear in the same order in r and $f(r)$. However, $f(r)$ may have additional events,
including additional communication between processes and new failure-detector events.
These new failure-detector events are the ones that we consider in determining whether
$\mathcal{R}'$ has failure detectors that satisfy properties such as strong completeness. In Section 3,
we use a particular instance of this conversion process to show how systems that allow
solutions to UDC can be used to implement failure detectors with certain properties.
For now, we leave it to the reader to check that Chandra and Toueg's conversion from
failure detectors satisfying weak completeness to ones satisfying strong completeness can
be implemented in this framework. This gives us the following result.

Proposition 2.1: A system $\mathcal{R}$ with weak (resp., impermanent-weak) failure detectors
can be converted to a system $\mathcal{R}'$ with strong (resp., impermanent-strong) failure detectors,
while preserving accuracy properties.

Note that we can trivially convert a failure detector that satisfies impermanent strong
completeness to one that satisfies strong completeness by always outputting the list of all
previously suspected processes. For convenience, we state this as a separate proposition.

Proposition 2.2: A system $\mathcal{R}$ with impermanent-strong failure detectors can be converted
to a system $\mathcal{R}'$ with strong failure detectors, while preserving accuracy properties.
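
The conversion behind Proposition 2.2 (always output everything suspected so far) is simple enough to sketch. The code below is an illustration only; the function name `accumulate` and the representation of a detector's output as a sequence of sets are assumptions made for the example.

```python
from typing import FrozenSet, Iterable, Iterator


def accumulate(reports: Iterable[Iterable[str]]) -> Iterator[FrozenSet[str]]:
    """Yield, for each original report, the union of all reports seen so far."""
    seen: set = set()
    for report in reports:
        seen |= set(report)
        yield frozenset(seen)
```

Once a process appears in any original report it stays in every later converted report, which is what turns impermanent suspicion into permanent suspicion. A process that is never suspected by the original detector is never suspected by the converted one either, so weak accuracy is preserved.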

As we show in Section 3, under a minimal assumption that should surely be satisfied 
in practice, if there is no bound on the number of faults (i.e., if there are runs where all 
processes may fail), a failure detector that satisfies weak accuracy must also satisfy strong 
accuracy. Thus, if there is no bound on the number of faults, then there is essentially no 
difference between impermanent-weak failure detectors and perfect failure detectors. 3 

2.3 The Formal Language 

Our language for reasoning about distributed coordination involves time and knowledge. 
The underlying notion of time is linear (so our language extends linear time temporal 
logic). We find it useful to be able to reason about the past as well as the future. 
Formally, we start with (application-dependent) primitive propositions and close under 
Boolean combinations, $\Box$, and the epistemic operators $K_p$ for each process p.

Following [FHMV95], we define the truth of a formula relative to a tuple $(\mathcal{R}, r, m)$
consisting of a system $\mathcal{R}$, run $r \in \mathcal{R}$, and time m. We write $(\mathcal{R}, r, m) \models \varphi$ if the formula
$\varphi$ is true at the point $(r, m)$ in system $\mathcal{R}$. Among the primitive propositions in the
language are $\mathit{send}_p(q, \mathit{msg})$, $\mathit{recv}_q(p, \mathit{msg})$, $\mathit{crash}(p)$, $\mathit{do}_p(a)$, and $\mathit{init}_p(a)$. The truth of
these primitive propositions is determined by the cut in the obvious way; for example,
$\mathit{send}_p(q, \mathit{msg})$ is true at a cut precisely when $\mathit{send}_p(q, \mathit{msg})$ is an event in p's history
component of the cut. $\Box\varphi$ holds at a point if $\varphi$ holds from that point on in the run.
Thus, $(\mathcal{R}, r, m) \models \Box\varphi$ if and only if $(\mathcal{R}, r, m') \models \varphi$ for all $m' \ge m$. As usual, we
define $\Diamond\varphi = \neg\Box\neg\varphi$; thus, $\Diamond$ is the dual of $\Box$. It is easy to see that $(\mathcal{R}, r, m) \models \Diamond\varphi$ if and only if
$(\mathcal{R}, r, m') \models \varphi$ for some $m' \ge m$. Finally, $K_p\varphi$ is true if $\varphi$ is true at all the points that
p considers possible, given its current history. Formally, $(\mathcal{R}, r, m) \models K_p\varphi$ if and only if
$(\mathcal{R}, r', m') \models \varphi$ for all points $(r', m') \sim_p (r, m)$ such that $r' \in \mathcal{R}$. We say a formula $\varphi$ is
valid in system $\mathcal{R}$, denoted $\mathcal{R} \models \varphi$, if $(\mathcal{R}, r, m) \models \varphi$ for all points $(r, m)$ in $\mathcal{R}$.

In our analysis, we make particular use of local and stable formulas. A formula $\varphi$ is
local to process p in system $\mathcal{R}$ if at every point in $\mathcal{R}$, p knows whether $\varphi$ is true; that is,
$\varphi$ is local to p in $\mathcal{R}$ if $K_p\varphi \lor K_p\neg\varphi$ is valid in $\mathcal{R}$. All formulas describing a process's local
state, for example, $\mathit{send}_p(q, \mathit{msg})$, $\mathit{recv}_q(p, \mathit{msg})$, $\mathit{crash}(p)$, and $\mathit{init}_p(a)$, are local to that
process. It follows from standard properties of knowledge (see [FHMV95]) that formulas
of the form $K_p\varphi$ are also local to p, since $K_p(K_p\varphi) \lor K_p(\neg K_p\varphi)$ is valid in every system.
A stable formula is one that, once true, remains true; that is, $\varphi$ is stable in system $\mathcal{R}$ if
$\varphi \Rightarrow \Box\varphi$ is valid in $\mathcal{R}$. All of $\mathit{send}_p(q, \mathit{msg})$, $\mathit{recv}_q(p, \mathit{msg})$, $\mathit{crash}(p)$, $\mathit{init}_p(a)$, and $\Box\varphi$ are
stable.
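
The truth conditions for the temporal and epistemic operators can be prototyped over a small finite system. The sketch below is an illustration under explicit assumptions: a system is a dictionary from run names to finite prefixes of cuts, temporal operators only look as far as the end of the prefix, a crash event is encoded as `("crash", q)`, and all function names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Cut = Dict[str, Tuple]          # process name -> finite history (tuple of events)
System = Dict[str, List[Cut]]   # run name -> finite prefix of cuts
Formula = Callable[[System, str, int], bool]


def box(phi: Formula) -> Formula:
    """phi holds at every time m' >= m (within the finite prefix)."""
    return lambda R, r, m: all(phi(R, r, k) for k in range(m, len(R[r])))


def diamond(phi: Formula) -> Formula:
    """Dual of box: phi holds at some time m' >= m (within the finite prefix)."""
    return lambda R, r, m: any(phi(R, r, k) for k in range(m, len(R[r])))


def knows(p: str, phi: Formula) -> Formula:
    """K_p phi: phi holds at every point of R where p has the same history."""
    def k(R: System, r: str, m: int) -> bool:
        here = R[r][m][p]
        return all(phi(R, r2, m2)
                   for r2, cuts in R.items()
                   for m2, cut in enumerate(cuts)
                   if cut[p] == here)
    return k


def crashed(q: str) -> Formula:
    """Primitive proposition crash(q): the crash event is in q's history."""
    return lambda R, r, m: ("crash", q) in R[r][m][q]
```

In a system with one run where q crashes silently and another where q stays up, a process p whose history is empty in both cannot distinguish the two, so `knows("p", crashed("q"))` is false even at the point where q has in fact crashed.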

3 In the notation of Chandra and Toueg, impermanent-$\mathcal{W}$ = impermanent-$\mathcal{S}$ = $\mathcal{S}$ = $\mathcal{P}$ for $t = n - 1$
or $t = n$ failures.

2.4 Distributed Coordination

We are interested in modeling distributed coordination of certain actions among the processes
in $\mathit{Proc}$. The actions may be allocating a resource, delivering multicast messages,
or committing a transaction; we are not concerned with the specifics. We are also not
concerned here with other requirements such as executing actions in a particular order
(e.g., total-order multicast) or not executing conflicting actions (e.g., consensus). We are
interested only in the eventual, distributed execution of these actions.

Formally, we assume that each process p has a set $A_p$ of coordination actions that it
can initiate. We assume that the sets $A_p$ and $A_q$ are disjoint for $p \ne q$. (Think of the
actions in $A_p$ as somehow being tagged by p.) The fact that an action a is in $A_p$ does
not mean that only p can perform a. However, it does mean that only p can initiate a;
no process can perform a unless p initiates it. We assume that for each action $a \in A_p$,
there is a special action $\mathit{init}_p(a)$ of p initiating a. The corresponding event $\mathit{init}_p(a)$ can
appear only in p's history, and can appear at most once in a run. Formally, for the rest
of this paper, we consider only systems $\mathcal{R}$ where, for all points $(r, m)$ in $\mathcal{R}$ and all actions
$a \in A_p$, the event $\mathit{init}_p(a)$ can appear only in $r_p(m)$ and can appear at most once in
$r_p(m)$.

Informally, a system satisfies Uniform Distributed Coordination (UDC) of action a if
whenever any $p' \in \mathit{Proc}$ executes $a \in A_p$, then so eventually does every correct $q \in \mathit{Proc}$.
In addition, no process performs $a \in A_p$ unless p initiates it. Intuitively, if $\mathit{init}_p(a)$
appears in p's history and p is nonfaulty in run r, then all the nonfaulty processes in r
should perform a. Formally, UDC of $a \in A_p$ holds in a system $\mathcal{R}$ if the following three
conditions hold:

DC1. $\mathcal{R} \models \mathit{init}_p(a) \Rightarrow \Diamond(\mathit{do}_p(a) \lor \mathit{crash}(p))$;

DC2. $\mathcal{R} \models \bigwedge_{q_1, q_2 \in \mathit{Proc}} (\mathit{do}_{q_1}(a) \Rightarrow \Diamond(\mathit{do}_{q_2}(a) \lor \mathit{crash}(q_2)))$;

DC3. $\mathcal{R} \models \bigwedge_{q_2 \in \mathit{Proc}} (\mathit{do}_{q_2}(a) \Rightarrow \mathit{init}_p(a))$.

Non-Uniform Distributed Coordination (nUDC) requires coordination only if the process
that performs a is correct. Thus, nUDC of a holds in a system $\mathcal{R}$ if DC1, DC3, and
the following hold:

DC2'. $\mathcal{R} \models \bigwedge_{q_1, q_2 \in \mathit{Proc}} (\mathit{do}_{q_1}(a) \Rightarrow \Diamond(\mathit{do}_{q_2}(a) \lor \mathit{crash}(q_2) \lor \mathit{crash}(q_1)))$.
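
The difference between DC2 and DC2' shows up clearly on a concrete run. The sketch below checks the coordination conditions for one action on a single finite run, under assumptions that are not part of the formal definition: the last cut of the prefix is treated as the state that holds forever after (which is sound for the "eventually" clauses only because the init, do, and crash propositions are stable), and events are encoded as `("init", a)`, `("do", a)`, and `("crash",)`. The function name `check_udc` is hypothetical.

```python
from typing import Dict, List, Tuple

Cut = Dict[str, Tuple]  # process name -> finite history (tuple of events)


def check_udc(run: List[Cut], initiator: str, action: str,
              uniform: bool = True) -> bool:
    """Check DC1-DC3 (or DC1, DC2', DC3 when uniform=False) on a finite run."""
    last = run[-1]
    procs = list(last)

    def did(cut: Cut, q: str) -> bool:
        return ("do", action) in cut[q]

    def crashed(cut: Cut, q: str) -> bool:
        return ("crash",) in cut[q]

    def initiated(cut: Cut) -> bool:
        return ("init", action) in cut[initiator]

    # DC1: an initiated action is eventually performed by the initiator,
    # unless the initiator crashes.
    if initiated(last) and not (did(last, initiator) or crashed(last, initiator)):
        return False
    for q1 in procs:
        # DC2' imposes nothing when the performer q1 itself crashes.
        if not did(last, q1) or (not uniform and crashed(last, q1)):
            continue
        # DC2 / DC2': every process eventually performs the action or crashes.
        if any(not (did(last, q2) or crashed(last, q2)) for q2 in procs):
            return False
    # DC3: the action is performed only at points where it has been initiated.
    return all(initiated(cut) for cut in run for q in procs if did(cut, q))
```

If p initiates and performs an action and then crashes before anyone else learns of it, while q stays up and never performs the action, the run violates UDC but satisfies nUDC.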

The next propositions show that, unlike UDC, nUDC is easy to attain, and that 
reliable communication is significant for UDC. (As we said earlier, all proofs are in the 
Appendix.) 

Proposition 2.3: There is a protocol that attains nUDC without the use of failure 
detectors in every context where communication is fair (although possibly unreliable), 
even if there is no bound on the number of failures.

Proposition 2.4: There is a protocol that attains UDC without the use of failure detectors
in every context where communication is reliable, even if there is no bound on the
number of failures.

Propositions 2.3 and 2.4 distinguish UDC and nUDC from consensus. Unlike consensus,
both UDC and nUDC are attainable in asynchronous systems with failures (although
UDC needs reliable communication); indeed, they are attainable without failure detectors 
no matter how many processes may fail. However, as we shall see in the next two sections, 
things change when we consider UDC in a context with unreliable communication. 

3 UDC With No Bound on Failures 

We start by showing that UDC is achievable in a context with fair but unreliable
communication, provided we have impermanent-strong failure detectors.

Proposition 3.1: There is a protocol that attains UDC in every context with strong 
failure detectors, even if there is no bound on the number of failures. 

In light of Propositions 2.1 and 2.2, the following corollary is immediate.

Corollary 3.2: There is a protocol that attains UDC in every context with impermanent-weak
failure detectors, even if there is no bound on the number of failures.

Chandra and Toueg [CT96] prove a result analogous to Proposition 3.1 for consensus. 
They show that consensus is achievable in every context where there are strong failure 
detectors, at most $n - 1$ failures, and where communication is reliable. Their algorithm
works without change even if we have only impermanent-strong failure detectors and 
allow n failures. Moreover, their algorithm can be modified easily to deal with unreliable, 
but fair, communication. Thus, unlike UDC, the reliability of communication has no 
significant impact on the attainability of consensus in these contexts. 

We prove in Theorem 3.6 below that under certain assumptions about the context 
(which include the assumption that there is no bound on the number of failures along with 
our usual implicit assumption that communication is fair, although possibly unreliable), 
if processes can perform UDC then they can simulate perfect failure detectors. It follows 
from Proposition 3.4 below that, under these assumptions, strong failure detectors are
equivalent to perfect failure detectors. Thus, we will be proving what is essentially a 
converse to Proposition 3.1. To prove this result, we need to make precise the notion of 
"simulating a perfect failure detector." 

"Simulating a perfect failure detector" means that we can convert a system $\mathcal{R}$ to a
system $\mathcal{R}'$ with perfect failure detectors, using the same type of conversion as outlined
in Section 2.2. We now sketch the conversion. Given a run $r \in \mathcal{R}$, we construct a run
$f(r)$ such that

P1. $f(r)(0) = (\langle\,\rangle, \ldots, \langle\,\rangle)$;

P2. if $r_p(m+1) = r_p(m) \cdot e$ and $e$ is not a failure-detector event, then
$f(r)_p(2m+2) = f(r)_p(2m+1) \cdot e$; if $r_p(m+1) = r_p(m) \cdot e$ and $e$ is a
failure-detector event or if $r_p(m+1) = r_p(m)$, then $f(r)_p(2m+2) = f(r)_p(2m+1)$;

P3. $f(r)_p(2m+1) = f(r)_p(2m) \cdot \mathit{suspect}'_p(S)$, where
$S = \{q : (\mathcal{R}, r, m) \models K_p(\mathit{crash}(q))\}$.

Thus, in $f(r)$, process p's history is identical to its history in r except that the
failure-detector events in r are deleted in $f(r)$, and, at each odd step in $f(r)$, p's failure
detector reports the processes that p knows have crashed at the corresponding point in
$\mathcal{R}$. Now define system $\mathcal{R}^f = \{f(r) : r \in \mathcal{R}\}$. We say that
$\mathcal{R}$ can simulate perfect failure detectors if the $\mathit{suspect}'$ failure detectors in
$\mathcal{R}^f$ are perfect. We shortly give conditions on $\mathcal{R}$ that guarantee that it can
simulate perfect failure detectors.
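
To make conditions P1-P3 concrete, the following is a minimal Python sketch of the conversion for a single process p. It is an illustration under stated assumptions, not part of the paper: histories are modeled as lists of events, failure-detector events are assumed to be tagged with the string "suspect", and the function `known_crashed(m)` is a hypothetical stand-in for the set $\{q : (\mathcal{R}, r, m) \models K_p(\mathit{crash}(q))\}$, which the sketch takes as given rather than computing.

```python
from typing import Callable, FrozenSet, List, Tuple

Event = Tuple[str, object]   # e.g. ("send", "msg") or ("suspect", frozenset({"q"}))
History = List[Event]


def is_fd_event(e: Event) -> bool:
    """Assumed convention: original failure-detector events are tagged 'suspect'."""
    return e[0] == "suspect"


def simulate_history(
    prefixes: List[History],
    known_crashed: Callable[[int], FrozenSet[str]],
) -> List[History]:
    """Given r_p(0), ..., r_p(M), return f(r)_p(0), ..., f(r)_p(2M).

    `known_crashed(m)` is a hypothetical oracle for the set of processes
    that p knows to have crashed at the point (r, m).
    """
    out: List[History] = [[]]                      # P1: empty history at time 0
    for m in range(len(prefixes) - 1):
        # P3: the odd step 2m+1 appends a simulated report suspect'_p(S).
        odd = out[-1] + [("suspect'", known_crashed(m))]
        out.append(odd)
        # P2: the even step 2m+2 copies the event added at step m+1 of r,
        # unless that event is a failure-detector event or no event was added.
        old, new = prefixes[m], prefixes[m + 1]
        if len(new) > len(old) and not is_fd_event(new[-1]):
            out.append(odd + [new[-1]])
        else:
            out.append(odd)
    return out
```

In this sketch the original failure-detector events simply disappear from the simulated history, and every odd step carries a fresh simulated report, mirroring the informal description above.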

As observed by Aguilera, Toueg, and Deianov [ATD99], our definition allows the
simulating function $f$ to be noncomputable. Technically, this is not quite right. The
input to $f$ is a run, which is an infinitary object, so it does not even make sense to
consider the computability or noncomputability of $f$. However, it is easy to modify $f$ so
that its input and output are not complete runs, but rather prefixes of runs. Given a prefix
of length $m + 1$ (i.e., given $r(0), \ldots, r(m)$ for some run $r$), $f$ returns a prefix of length
$2m + 2$. Conditions P1-P3 still make sense with that change. With that change, it is clear
that $f$ is computable provided that $\{q : (\mathcal{R}, r, m) \models K_p(\mathit{crash}(q))\}$ is
computable for each p, r, and m. While it is possible to construct systems where this set is not
computable, it will be computable in any "reasonable" system. That is because whether
$K_p(\mathit{crash}(q))$ holds at the point $(r, m)$ is typically determined by some easily
characterizable sequence of events in p's history. While it is beyond the scope of this paper to
characterize when $f$ is computable (it is not even clear how interesting such a characterization
would be), it should be clear that it typically is computable.

The notion of simulation implicitly underlying this definition is more general than,
but compatible with, the notion of reduction used by Chandra, Hadzilacos, and Toueg
[CHT96]. For example, in that paper, it is shown that if consensus can be solved by
means of a failure detector (and there are at most $t < n/2$ failures), then that failure
detector can be transformed to a particular failure detector called ◇W (for eventually
weak), which satisfies eventual weak accuracy and weak completeness; see [CT96] for the
precise definition. Since consensus can be solved with ◇W failure detectors, these failure
detectors are viewed as the weakest failure detectors for consensus.

The key point is that the results of Chandra, Hadzilacos, and Toueg do not apply if 
UDC is solved without the use of a failure detector. However, our notion of simulation 
does not depend on using failure detectors to attain UDC. Thus, it applies in situations 
where some other type of oracle is used, for example, an oracle that gives limited infor- 
mation about which actions have been initiated, in which case the reductions of Chandra, 
Hadzilacos, and Toueg may not apply at all.

We next describe some conditions on a system $\mathcal{R}$ that together will suffice to show
that $\mathcal{R}$ can simulate perfect failure detectors. Before stating them, we need a definition.

Definition 3.3: A formula $\varphi$ local to q is said to be insensitive to failure by q in
$\mathcal{R}$ if for all runs $r, r' \in \mathcal{R}$ and all times $m, m'$, if
$r'_q(m') = r_q(m) \cdot \mathit{crash}_q$, then $(\mathcal{R}, r, m) \models \varphi$
iff $(\mathcal{R}, r', m') \models \varphi$.

Now consider the following five conditions on a system $\mathcal{R}$.

A1. If there exists a run in $\mathcal{R}$ where all the processes in S crash, and $(r, m)$ is a
point in $\mathcal{R}$ such that no process in $\mathit{Proc} - S$ has crashed, then there is a run
$r'$ extending $(r, m)$ such that $F(r') = S$.

A2. For all runs $r_1, r_2 \in \mathcal{R}$ and times $m$, if $F(r_1) = F(r_2) = F$ and
$(r_1, m) \sim_q (r_2, m)$ for all $q \notin F$, then there are extensions $r_1'$ and $r_2'$ of
$(r_1, m)$ and $(r_2, m)$, respectively, such that all the processes in F crash by time $m + 1$ in
$r_1'$ and $r_2'$ and $(r_1', m') \sim_q (r_2', m')$ for all $m' > m$ and all $q \notin F$.

A3. The formula $K_q\,\mathit{init}_p(\alpha)$ is insensitive to failure by q.

A4. If $\varphi$ is (a) stable in $\mathcal{R}$, (b) local to some process p in Proc, and (c)
insensitive to failure by p, then for all points $(r, m)$ in $\mathcal{R}$, if there is some nonempty
$S \subseteq \mathit{Proc}$ such that $(\mathcal{R}, r, m) \models \bigwedge_{q \in S} \neg K_q \varphi$,
then there exists a point $(r', m)$ such that (a) $r'_q(m) = r_q(m)$ for all $q \in S$; (b) for all
$q \notin S$, there is a (not necessarily strict) prefix $h$ of $r_q(m)$ such that either
$r'_q(m) = h$ or $r'_q(m) = h \cdot \mathit{crash}_q$ and q crashes by time m in r; and (c)
$(\mathcal{R}, r', m) \models \neg\varphi$.⁴

A5$_t$. For every $S \subseteq \mathit{Proc}$ such that $|S| \le t$, there exists a run
$r_S \in \mathcal{R}$ such that $F(r_S) = S$.

We now briefly discuss these conditions and their implications. A1 essentially says
that process failures are independent of other events. If it is possible for the processes
in S to crash, this may happen at any time in any run. A3 says that a process q cannot
learn that p initiated α just by q's crashing. A1 and A3 are properties we would expect
to hold of all systems generated by protocols in the contexts of interest to us.

A2 says that it is possible for all the faulty processes in r that have not crashed by
time m to crash at the next step. More precisely, if two points $(r, m)$ and $(r', m)$ are
indistinguishable to the correct processes in r, then there are extensions $r_1$ and $r_2$ of these
points that continue to be indistinguishable to all the correct processes in r, such that all
the faulty processes in r have failed by time $m + 1$ in $r_1$ and $r_2$. A2 implicitly assumes that
there is no information relevant to the system beyond what is in the correct processes'
states. In particular, this means that there cannot be completely reliable message buffers
in the system. For suppose that q had a message buffer such that once a message was
in q's buffer, then as long as q did not crash, q would eventually receive the message.
Consider two runs $r_1$ and $r_2$ such that $(r_1, m) \sim_q (r_2, m)$, $F(r_1) = F(r_2)$, and $F(r_1)$
consists of all processes other than q. Moreover, suppose that there is a message msg in
q's buffer in $(r_1, m)$, but not in $(r_2, m)$. By A2, there are extensions $r_1'$ and $r_2'$ of
$(r_1, m)$ and $(r_2, m)$ such that all processes other than q crash in round $m + 1$ in both $r_1'$
and $r_2'$ and $(r_1', m') \sim_q (r_2', m')$ for all $m' > m$. But this is impossible, since q
receives msg in $r_1'$ but not in $r_2'$. More generally, A2 says that communication is unreliable.
It does not hold if the network cannot lose all messages that might be in transit at any given time.

4 For those familiar with the notion of distributed knowledge [FHMV95], note that conditions (a) and
(c) imply that the processes in S do not have distributed knowledge of $\varphi$.

A4 says, among other things, that if each of the processes in S considers $\neg\varphi$ possible,
where $\varphi$ is a stable failure-insensitive formula local to some process, then there is a point
where $\neg\varphi$ is true that all the processes in S simultaneously consider possible. A4 is
perhaps the least standard property. It holds if processes are essentially using a
full-information protocol (FIP) [Coa86, FHMV95] and if $\mathcal{R}$ places some restrictions on the
information they can get from failure detectors. With an FIP, when a process p sends a
message to q, it sends complete information about its state. The following example shows
that, without FIPs, A4 can fail to be true. Consider a system $\mathcal{R}$ where processes send
messages that are formulas in the language defined in Section 2.3. Moreover, assume
that every message sent is true at the time that it is sent. Let $(r, m)$ be a point in $\mathcal{R}$
such that neither p nor q has crashed at $(r, m)$, and at some time $m'' < m$, q sends a
message msg to p', which p' receives. After receiving msg, p' sends p a message saying
$\mathit{crash}(q) \lor \mathit{send}_q(p', \mathit{msg})$, which p receives by time m. Further suppose
that p' has a perfect failure detector, and there is another run $r'$ in $\mathcal{R}$ such that
$r_p(m) = r'_p(m')$ and, in $r'$, process p' knows that q has crashed (since its failure detector
reported this) and q does not send p' the message msg. It follows that
$(\mathcal{R}, r, m) \models K_p(\mathit{crash}(q) \lor \mathit{send}_q(p', \mathit{msg})) \land
\neg K_p(\mathit{crash}(q)) \land \neg K_p(\mathit{send}_q(p', \mathit{msg}))$.
Process p knows $\mathit{crash}(q) \lor \mathit{send}_q(p', \mathit{msg})$ because
it received a message from p' saying this, and messages are known to be truthful in $\mathcal{R}$.
Process p does not know $\mathit{crash}(q)$ (since q actually has not crashed at the point $(r, m)$)
nor does p know $\mathit{send}_q(p', \mathit{msg})$ (since $\mathit{send}_q(p', \mathit{msg})$ is not
true at $(r', m')$). But then A4 does not hold in $\mathcal{R}$ for
$\varphi = \mathit{send}_q(p', \mathit{msg})$ and $S = \{p\}$. For suppose it did hold.
Then there must be a point $(r'', m)$ in $\mathcal{R}$ such that (a) $r''_p(m) = r_p(m)$, (b)
$r''_q(m)$ is a prefix of $r_q(m)$ (since q does not crash in r), and (c)
$(\mathcal{R}, r'', m) \models \neg\mathit{send}_q(p', \mathit{msg})$. Since
$(\mathcal{R}, r, m) \models K_p(\mathit{crash}(q) \lor \mathit{send}_q(p', \mathit{msg}))$, it follows
that $(\mathcal{R}, r'', m) \models \mathit{crash}(q)$, violating the assumption that $r''_q(m)$ is a
prefix of $r_q(m)$.

In this example, p' did not tell p all it knew, which is precisely what cannot happen
with a full-information protocol. Assuming that $\mathcal{R}$ is generated by an FIP, under
reasonable assumptions about the runs in $\mathcal{R}$ (discussed below), $\mathcal{R}$ will satisfy
A4. To see why, observe that given $(r, m)$ and $\varphi$ as in the hypotheses of A4, we can
construct the run $r'$ as follows. First note that $(\mathcal{R}, r, 0) \models \neg\varphi$, for
otherwise, since $\varphi$ is stable and local to p, $\varphi$ would be true at all points in
$\mathcal{R}$ and so would $K_q \varphi$ for all $q \in \mathit{Proc}$. Thus, let $m_p$ be
the first time in r where $\varphi$ becomes true. If $m_p > m$, then take $r' = r$ and
$S = \mathit{Proc}$; A4 trivially holds in this case. If $m_p \le m$, let $S \subseteq \mathit{Proc}$
be the processes that do not know $\varphi$ at $(r, m)$. If processes are following a
full-information protocol, there can be no chain of messages from p to a process $q \in S$ between
times $m_p$ and $m$ in r, for if there were, q would know $\varphi$ at $(r, m)$.⁵ For each process
$q \in \mathit{Proc}$, let $m_q$ be the least time at or before m at which there is a message chain
from p to q in r between $m_p$ and $m_q$, if there is such a time; otherwise, we take
$m_q = m + 1$. Note that for $q \in S$, we have $m_q = m + 1$.
We then construct $r'$ so that $r'_q(m') = r_q(m')$ for $m' \le m_q - 1$; if q does not crash in r
between times $m_q$ and $m$ inclusive, then $r'_q(m') = r_q(m_q - 1)$ for $m' \ge m_q$; otherwise,
$r'_q(m') = r_q(m_q - 1) \cdot \mathit{crash}_q$ for $m' \ge m_q$. By construction, we have
$r_q(m) = r'_q(m)$ for $q \in S$. For $q' \notin S$, we have that $r'_{q'}(m)$ is either
$r_{q'}(m_{q'} - 1)$ or $r_{q'}(m_{q'} - 1) \cdot \mathit{crash}_{q'}$. The
reason we need to add $\mathit{crash}_{q'}$ is that the failure detector of some process $q \in S$
might report that q' fails in r. Since $r_q(m) = r'_q(m)$, if q's failure detector is accurate, it
must be the case that q' also fails in $r'$. As long as $r' \in \mathcal{R}$, it is easy to see that
the point $(r', m)$ satisfies the requirements of (this instance of) A4. For by construction,
$r'_q(m) = r_q(m)$ for $q \in S$; and for $q' \notin S$, the construction guarantees that either
$r'_{q'}(m) = r_{q'}(m_{q'} - 1)$ or $r'_{q'}(m) = r_{q'}(m_{q'} - 1) \cdot \mathit{crash}_{q'}$. By
choice of $m_p$, we have that $(\mathcal{R}, r, m_p - 1) \models \neg\varphi$. Note that
if $(\mathcal{R}, r, m) \models \neg\mathit{crash}(p)$ then $r'_p(m) = r_p(m_p - 1)$; otherwise, either
$r'_p(m) = r_p(m_p - 1)$ or $r'_p(m) = r_p(m_p - 1) \cdot \mathit{crash}_p$. Since $\varphi$ is
insensitive to failure by p, in either case, we have that $(\mathcal{R}, r', m) \models \neg\varphi$.
Thus, $(r', m)$ satisfies the requirements of A4.

This argument shows that as long as it is the case that, for each formula $\varphi$ and
point $(r, m)$ satisfying the hypotheses of A4, there is a run $r' \in \mathcal{R}$ as constructed
above (actually, it suffices that there is a run in $\mathcal{R}$ that extends $(r', m)$), then
$\mathcal{R}$ satisfies A4. Thus, for example, it cannot be the case that the failure detector
reports in $\mathcal{R}$ are correlated with message delivery, so that a report from a failure
detector saying that p is faulty is accurate iff p did not receive a message from q. We do not
attempt to completely characterize the conditions under which $\mathcal{R}$ satisfies A4 here.

A5$_t$ says that any subset of processes of size at most t may fail in some run. This is
a standard assumption in the literature. Note that A5$_t$ implies A5$_{t'}$ if $t > t'$.

Theorem 3.6 below shows that if $\mathcal{R}$ attains UDC and satisfies A1-A4 and A5$_n$ (or
A5$_{n-1}$) and one other quite innocuous condition, then $\mathcal{R}$ can simulate perfect failure
detectors. The "innocuous condition" is the following: Clearly any solution to UDC
should allow a process to initiate any of its actions at any time. To guarantee that $\mathcal{R}$
can simulate perfect failure detectors, it is necessary that, in each run r of $\mathcal{R}$, the
correct processes (if there are any) initiate actions infinitely often. That is, for all runs r, if
$F(r) \ne \mathit{Proc}$, then for all times m, some correct process in r initiates an action after
$(r, m)$. Intuitively, if actions are initiated infinitely often, the correct processes will need
to be able to detect failures in order to attain UDC repeatedly. On the other hand, if
the correct processes do not initiate actions after some point, there will be no need for
them to detect failures after this point. To see the need for this condition, suppose that
no actions are initiated after time 17, even though there are some correct processes in $\mathcal{R}$
and UDC is attained for all these actions by time 25. Now consider a process q that fails
after time 17. There is no need for processes to know that q has failed, since UDC is not
required after time 25.

5 There is a message chain from p to q between $m_p$ and $m \ge m_p$ if there is a sequence of messages
$\mathit{msg}_1, \ldots, \mathit{msg}_k$ and processes $p_1, \ldots, p_{k+1}$ such that (a)
$\mathit{msg}_i$ is sent by $p_i$ to $p_{i+1}$ and is received, (b) $p_{i+1}$ sends
$\mathit{msg}_{i+1}$ after receiving $\mathit{msg}_i$, (c) $p = p_1$, (d) $q = p_{k+1}$, (e) p sends
$\mathit{msg}_1$ at or after $m_p$, and (f) q receives $\mathit{msg}_k$ at or before m. If the
processes follow a full-information protocol, then when $p_{i+1}$ receives $\mathit{msg}_i$, it knows
all the stable facts that $p_i$ knew when $p_i$ sent $\mathit{msg}_i$.

To summarize, our result can be viewed as saying that under some relatively innocuous
assumptions (A1-A3 and the assumption that correct processes initiate actions infinitely
often), if any subset of processes may fail (A5$_n$) and the processes are telling each other
as much as they can (A4), then being able to attain UDC is tantamount to being able
to simulate perfect failure detectors.

Before proving Theorem 3.6, we prove two preliminary results. The first shows that, 
in the contexts of interest to us, weak accuracy and strong accuracy are equivalent. The 
second provides a characterization of the facts that must be known by a process before 
it can perform a coordination action a. Specifically, a process must know that if there 
are any correct processes at all, then one of these knows that a has been initiated. 



Proposition 3.4: If $\mathcal{R}$ satisfies A1 and A5$_{n-1}$, then $\mathcal{R}$ satisfies weak
accuracy iff $\mathcal{R}$ satisfies strong accuracy.



It follows from Proposition 3.4 that if $\mathcal{R}$ satisfies A1 and A5$_{n-1}$, then
$\mathcal{R}$ has strong failure detectors iff $\mathcal{R}$ has perfect failure detectors. (Since
A5$_n$ implies A5$_{n-1}$, this is a fortiori the case if $\mathcal{R}$ satisfies A1 and A5$_n$.)



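Proposition 3.5: If $\mathcal{R}$ satisfies A1, A2, and A4, then

\[
\mathcal{R} \models \bigwedge_{p, p' \in \mathit{Proc}} \; \bigwedge_{\alpha \in A_{p'}}
\Big[ K_p\Big(\mathit{init}_{p'}(\alpha) \land \bigwedge_{q \in \mathit{Proc}}
\Diamond\big(K_q\,\mathit{init}_{p'}(\alpha) \lor \mathit{crash}(q)\big)\Big)
\Rightarrow K_p\Big(\bigvee_{q \in \mathit{Proc}} \Box\neg\mathit{crash}(q) \Rightarrow
\bigvee_{q \in \mathit{Proc}} \big(K_q\,\mathit{init}_{p'}(\alpha) \land \Box\neg\mathit{crash}(q)\big)\Big) \Big].
\]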



We are now ready to state our theorem. 



Theorem 3.6: Suppose $\mathcal{R}$ is the system generated by a protocol that attains UDC,
$\mathcal{R}$ satisfies A1-A4 and A5$_{n-1}$, and for each run $r \in \mathcal{R}$, if
$F(r) \ne \mathit{Proc}$, then infinitely many actions are initiated in r (i.e., infinitely many
events of the form $\mathit{init}_p(\alpha)$ appear in r). Then the system $\mathcal{R}^f$ has perfect
failure detectors.

There are two issues worth noting regarding Theorem 3.6.⁶ First, the alert reader
may have noticed an apparent contradiction in our results. Proposition 2.4 states that
UDC can be attained in contexts where communication is reliable, without using failure
detectors. Theorem 3.6 states that, if UDC can be attained, then perfect failure detectors
can be simulated. Thus, perfect failure detectors can be simulated in systems where
communication is reliable. Since it is well known that consensus can be attained with
perfect failure detectors, this suggests that consensus can be attained in systems where
communication is reliable, regardless of the number of process failures. But this is well
known to be false [FLP85].

6 We thank one of the reviewers of this paper for pointing out these issues and encouraging us to
discuss them.

Our results are correct. We escape from the contradiction because, as we noted earlier,
A2 specifically precludes reliable communication. Thus, Theorem 3.6 does not apply to
systems of the type considered in Proposition 2.4. This observation does emphasize that
our main results apply only to systems where communication is unreliable.

Second, the assumption that infinitely many actions must be initiated in each run of
$\mathcal{R}$ in Theorem 3.6 may strike some readers as unduly strong (although it can be argued
that a service should expect to operate indefinitely and therefore handle infinitely many
requests). In any case, the theorem can be rephrased in a way that might make it
more palatable. Consider a context that allows solutions to UDC and satisfies A1-A4
and A5$_{n-1}$. Then there is a joint protocol $(P_1, \ldots, P_n)$ that, when run in that context,
generates a system $\mathcal{R}$ such that $\mathcal{R}^f$ has perfect failure detectors. The proof of
this result is essentially identical to that of Theorem 3.6: the joint protocol
$(P_1, \ldots, P_n)$ is one where each process that does not crash initiates an infinite number of
actions. Indeed, every joint protocol where each process that does not crash initiates an infinite
number of actions generates a system $\mathcal{R}$ where $\mathcal{R}^f$ has perfect failure
detectors. Thus, the result really shows that in contexts where UDC can be solved, UDC can be used
to generate perfect failure detectors.

4 Generalized Failure Detectors 

Theorem 3.6 shows that if as many as $n - 1$ processes can fail, then UDC essentially
requires perfect failure detectors. On the other hand, as Gopal and Toueg [GT89] show,
UDC is attainable without using failure detectors in contexts where there are fewer than
$n/2$ failures. We now generalize both of these results, characterizing the type of failure
detector needed to attain UDC if there is a bound of t on the number of possible failures,
for all values of t.

A generalized failure detector reports that (it suspects that) at least k processes in a
set S are faulty.⁷ As discussed in the Introduction, such generalized failure detectors are
appropriate when processes can observe faulty behavior in some component(s) without
being able to tell which processes in the component are actually faulty. We model such
generalized suspicions by using events of the form $\mathit{suspect}_p(S, k)$, with $k \le |S|$.⁸ We are

7 Despite the name, generalized failure detectors are a special case of failure detector as defined in 
Section 2.2, as well as being a special case of the failure detectors defined by Aguilera, Toueg, and 
Deianov [ATD99] . 

8 Again, it is not necessary that the report of the failure detector has the form (S, k). We can define

interested in generalized failure detectors that give useful information. Of course, what
is "useful" may depend on the application. Given a system $\mathcal{R}$ and an upper bound of t
on the number of failures that may occur in a run r of $\mathcal{R}$, we say that
$\mathit{suspect}_p(S, k)$ is a t-useful failure-detector event for r if (a) $F(r) \subseteq S$, (b)
$n - |S| > \min(t, n - 1) - k$ (or, equivalently, $k > |S| - n + \min(t, n - 1)$), and (c)
$k \le |S|$. Intuitively, if a generalized failure detector is "good", then some of its reports are
t-useful failure-detector events. Note that if p learns at the point $(r, m)$ that there are k faulty
processes in S and $n - |S| > \min(t, n - 1) - k$, then p can conclude that, if there are any
correct processes at all in r, then one of the processes in $\mathit{Proc} - S$ is correct at
$(r, m)$ (although it may not know which one). Just knowing that some processes in a set are correct
is not useful in general. For example, if $t < n$, then all processes know that at least $n - t$
processes in Proc are correct. As we shall see, what makes this fact useful is that
$F(r) \subseteq S$.
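
The three conditions defining a t-useful event can be checked mechanically. The following is a small Python sketch, given only as an illustration: the function name `is_t_useful` and the representation of S and F(r) as Python sets are assumptions made here, not notation from the paper.

```python
def is_t_useful(S, k, faulty, n, t):
    """Check whether suspect(S, k) is a t-useful event for a run whose
    faulty set F(r) is `faulty`, in a system of n processes.

    S and faulty are iterables of process names (an assumed encoding).
    """
    S, faulty = frozenset(S), frozenset(faulty)
    return (
        faulty <= S                               # (a) F(r) is a subset of S
        and n - len(S) > min(t, n - 1) - k        # (b) n - |S| > min(t, n-1) - k
        and k <= len(S)                           # (c) k <= |S|
    )
```

For instance, with n = 5 and t = 2, the report (S, 0) with |S| = 2 passes condition (b) because 3 > 2, whereas with n = 4 and t = 2 the same kind of report fails it because 2 > 2 is false. With t = n - 1, condition (b) forces k = |S|.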

A generalized failure detector in $\mathcal{R}$ is t-useful if for all $r \in \mathcal{R}$ and
processes p, the following hold:

Generalized Strong Accuracy: if $\mathit{suspect}_p(S, k)$ is in $r_p(m)$, then there is a subset
$S' \subseteq S$ such that $|S'| = k$ and for all $q \in S'$, we have that $\mathit{crash}_q$ is in
$r_q(m)$.

Generalized Impermanent Strong Completeness: if p is correct, then there is a t-useful
failure-detector event for r in $r_p(m)$, for some m.

Note that it is trivial to construct a t-useful failure detector in a context with at
most t failures if $t < n/2$: for each $S \subseteq \mathit{Proc}$ with $|S| = t$, output (S, 0)
infinitely often. Suspecting no processes in any subset S trivially satisfies generalized strong
accuracy, and in every run r at least one t-sized subset of Proc must contain $F(r)$. Whenever
$F(r) \subseteq S$, then (S, 0) is a t-useful failure-detector event.

Also note that if $\mathit{suspect}_p(S, k)$ is an $(n - 1)$-useful or n-useful failure-detector
event, then we must have $|S| = k$, since the only way to have $k > |S| - 1$ is to have $k = |S|$.
Thus, we can easily convert an n-useful or $(n - 1)$-useful generalized failure detector to
a perfect failure detector, by just reporting $\mathit{suspect}_p(S')$ at time m in run r if $S'$ is
the union of the sets S such that the generalized failure detector has reported
$\mathit{suspect}_p(S, k)$ with $|S| = k$ prior to time m. Conversely, we can easily convert a perfect
failure detector to an n-useful (and hence $(n - 1)$-useful) failure detector. Given a history for
process p, we simply replace each event $\mathit{suspect}_p(S)$ by the event
$\mathit{suspect}_p(S', k)$, where $S'$ is the union of S together with all the sets that appeared in
failure-detector events of the perfect failure detector earlier in the history, and $k = |S'|$. It is
easy to see that this gives an n-useful failure detector. Thus, the following result generalizes
Proposition 3.1 and Gopal and Toueg's result.
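
The two conversions just described operate only on the sequence of reports in a process's history, so they can be sketched in a few lines of Python. This is an illustrative sketch under assumptions: reports are given as a list in the order they occur, sets are Python frozensets, and the function names are hypothetical.

```python
def generalized_to_perfect(reports):
    """reports: list of (S, k) pairs in the order they were issued.

    After each report, emit the union of all sets S seen so far whose
    report had |S| = k (the only reports that pin down specific crashes).
    """
    suspected = frozenset()
    out = []
    for S, k in reports:
        if len(S) == k:
            suspected |= frozenset(S)
        out.append(suspected)
    return out


def perfect_to_generalized(reports):
    """reports: list of suspect sets S from a perfect failure detector.

    Replace each S by (S', |S'|), where S' accumulates S together with
    every set reported earlier in the history.
    """
    seen = frozenset()
    out = []
    for S in reports:
        seen |= frozenset(S)
        out.append((seen, len(seen)))
    return out
```

Note that `generalized_to_perfect` ignores every report with k < |S|; by the observation above, such reports cannot be n-useful or (n - 1)-useful events anyway.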

Proposition 4.1: There is a protocol that attains UDC in a context with a bound of t
on the number of failures and with t-useful generalized failure detectors.

(/-generalized failure detector whose reports can be mapped to pairs (S, k). For ease of exposition, we 
do not bother doing this. 



                                  t < n/2      n/2 ≤ t < n − 1    n − 1 ≤ t ≤ n
Reliable channels    UDC          no FD        no FD              no FD
                     consensus    ◇W †         Strong             Perfect †
Unreliable channels  UDC          no FD        t-useful †         Perfect †
                     consensus    ◇W †         Strong             Perfect †

Table 1: The type of failure detector needed for UDC vs. consensus; † indicates optimality.



Since, as observed earlier, it is trivial to construct a t-useful failure detector in a 
context with at most t failures, if t < n/2, we get the following result of Gopal and 
Toueg [GT89] as an immediate corollary to Proposition 4.1. 

Corollary 4.2: If t < n/2, then there is a protocol that attains UDC without failure 
detectors. 

We want a converse to Proposition 4.1 that generalizes Theorem 3.6. We show that if 
processes can perform UDC in a context with a bound t on the number of failures, then 
t-useful generalized failure detectors can be simulated in that context. 

Given system $\mathcal{R}$, construct system $\mathcal{R}^{f'}$ as follows. Fix an order
$S_0, \ldots, S_{2^n - 1}$ of the subsets of Proc. Let
$\mathcal{R}^{f'} = \{f'(r) : r \in \mathcal{R}\}$ where, for each run $r \in \mathcal{R}$, $f'(r)$ is
constructed exactly as $f(r)$ in Section 3, except that P3 is replaced by the following condition.

P3'. $f'(r)_p(2m+1) = f'(r)_p(2m) \cdot \mathit{suspect}'_p(S_l, k)$, where $l$ is the length of the
history $r_p(m + 1)$ mod $2^n$ and

$k = \max\{k' : (\mathcal{R}, r, m) \models K_p(k' \text{ processes in } S_l \text{ have crashed})\}$.
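
The indexing in P3' can be illustrated with a short Python sketch. Several things here are assumptions made for illustration only: the particular ordering of subsets (by size, then lexicographically), the function names, and above all the simplification that p's knowledge is represented by a plain set `known_crashed` of processes it knows to have crashed, so that k is taken to be the size of the intersection with $S_l$. In the paper k is defined by what p knows, which could in principle exceed this count.

```python
from itertools import combinations


def ordered_subsets(procs):
    """Fix one (arbitrary, assumed) order S_0, ..., S_{2^n - 1} of the subsets."""
    procs = sorted(procs)
    return [
        frozenset(c)
        for size in range(len(procs) + 1)
        for c in combinations(procs, size)
    ]


def p3_prime_report(history_len, subsets, known_crashed):
    """Return the pair (S_l, k) reported at an odd step.

    l is the history length modulo the number of subsets; k counts the
    members of S_l in `known_crashed` (a simplified stand-in for the
    knowledge-based definition of k).
    """
    l = history_len % len(subsets)
    S = subsets[l]
    return S, len(S & frozenset(known_crashed))
```

Taking the index modulo $2^n$ means that, as p's history grows, the reports keep cycling through every subset of Proc, which is what allows some report to eventually land on a set containing F(r).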

Theorem 4.3: Suppose $\mathcal{R}$ is the system generated by a protocol that attains UDC in
a context with at most t failures, $\mathcal{R}$ satisfies A1-A4 and A5$_t$, and for each run
$r \in \mathcal{R}$, if $F(r) \ne \mathit{Proc}$, then infinitely many actions are initiated in r.
Then $\mathcal{R}^{f'}$ has t-useful generalized failure detectors.

As with Theorem 3.6, we can restate Theorem 4.3 to say that in any context with at
most t failures where UDC can be attained, there is a joint protocol P that generates a
system $\mathcal{R}$ such that $\mathcal{R}^{f'}$ has t-useful failure detectors.

5 Conclusions 

We have shown that the problem of Uniform Distributed Coordination in asynchronous
systems varies in its complexity both with communication guarantees and with the number
of failures that must be tolerated (see Table 1). Unlike consensus (or nUDC, for that
matter), UDC is sensitive to communication guarantees in the contexts that we consider
in this paper. This is significant since UDC is likely the only acceptable reliability
guarantee for many wide-area and collaborative mobile applications, precisely where reliable
communication cannot be assumed.

Note that we have completely characterized the type of failure detector required to
attain UDC for all values of t. For consensus, it is known that ◇W is necessary and
sufficient if $t < n/2$. (Recall that in this case no failure detectors at all are necessary
to attain UDC.) While strong (actually, impermanent-strong) failure detectors suffice for
consensus for $n/2 \le t < n$, there is no characterization of exactly the type of failure
detector that is required. The notion of t-useful failure detectors defined here may prove
useful in that regard. We leave exploring this issue for future work.

As we mentioned in the introduction, in a paper written in response to the conference 
version of this paper, Aguilera, Toueg, and Deianov [ATD99] provided an elegant alternative
characterization of the weakest failure detector required for UDC.⁹ They show that
the weakest failure detector for this problem is one that satisfies strong completeness and 
a notion of accuracy even weaker than what we have called weak accuracy: if there is a 
correct process, then at all times, some correct process is not suspected (but a different 
correct process may be the one that is not suspected at every time). They show that 
if UDC can be solved with a failure detector F, then F can be reduced (i.e., effectively 
transformed) into this weakest failure detector. Technically, this result is incomparable 
to our result. On the one hand, it is stronger, in that it gives a failure detector that solves 
UDC in all contexts (not just ones satisfying A1-A4 and A5$_t$, which are the only ones
considered in this paper), and is in a precise sense the weakest failure detector needed to 
solve UDC. On the other hand, as we observed in the discussion preceding Theorem 3.6, 
because our results do not proceed by reduction of one failure detector to another, our 
results apply even in cases where some technology other than failure detectors is used to 
solve UDC. 

Appendix — Proofs of Propositions and Theorems 

Proposition 2.3: There is a protocol that attains nUDC without the use of failure 
detectors in every context where communication is fair (although possibly unreliable), 
even if there is no bound on the number of failures. 

Proof: We just sketch the protocol here, since it is so simple. Whenever a process p
wants to attain nUDC of action α (i.e., if $\mathit{init}_p(\alpha)$ is in p's history), p goes into
a special nUDC(α) state. If a process is in an nUDC(α) state, it performs α and sends an α-message
repeatedly to all other processes (which, intuitively, tells them to perform α). If a process
receives an α-message, it goes into an nUDC(α) state, if it has not already done so. It is
easy to see that this protocol attains nUDC.¹⁰ ∎

9 Actually, their results are given for URB, uniform reliable broadcast, but URB and UDC are
isomorphic problems; the init and do in UDC correspond to broadcast and deliver in URB.
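
The protocol in the proof of Proposition 2.3 can be exercised with a small round-based Python simulation. This is only a sketch under simplifying assumptions that are not in the paper: time proceeds in synchronous rounds, a crashed process takes no steps from its crash round onward, and each channel is modeled as lossy but fair by delivering only every `deliver_every`-th message sent on it. The function name and parameters are hypothetical.

```python
def run_nudc(procs, initiator, crash_at, rounds, deliver_every=3):
    """Simulate the nUDC protocol sketch for a single action.

    crash_at maps a process to the round at which it crashes; processes
    absent from the map are correct. Returns the set of processes that
    performed the action.
    """
    never = rounds + 1                      # sentinel: "does not crash"
    in_state = {initiator}                  # processes in the nUDC state
    performed = set()
    sends = {}                              # (sender, receiver) -> messages sent
    for rnd in range(rounds):
        alive = [p for p in procs if crash_at.get(p, never) > rnd]
        newly = set()
        for p in alive:
            if p not in in_state:
                continue
            performed.add(p)                # perform the action, then keep sending
            for q in procs:
                if q == p:
                    continue
                sends[(p, q)] = sends.get((p, q), 0) + 1
                delivered = sends[(p, q)] % deliver_every == 0
                if delivered and crash_at.get(q, never) > rnd:
                    newly.add(q)            # q enters the nUDC state
        in_state |= newly
    return performed
```

With no crashes, every process eventually performs the action despite message loss. If the initiator crashes right after performing it, before any message gets through, it is the only process to perform the action; that outcome is exactly what UDC forbids, since a process performed the action while the correct processes did not.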

Proposition 2.4: There is a protocol that attains UDC without the use of failure 
detectors in every context where communication is reliable, even if there is no bound on 
the number of failures. 

Proof: We proceed just as in the proof of Proposition 2.3, except that before performing
the action α, a process simply sends a message to all other processes telling them to
perform α and inform all other processes if they have not already done so. More precisely,
if $\mathit{init}_p(\alpha)$ is in p's history, p goes into a special UDC(α) state. If a process is
in a UDC(α) state, it sends an α-message to all processes and then performs α. If a process receives
an α-message, it goes into a UDC(α) state if it has not already done so. Since a process
q performs α only after sending out an α-message to all processes and, by assumption,
communication is reliable, if q performs α, then other correct processes will receive the
message, and thus also perform α, even if q crashes. ∎

Proposition 3.1: There is a protocol that attains UDC in every context with strong 
failure detectors, even if there is no bound on the number of failures. 

Proof: The proof is similar in spirit to that of Proposition 2.3. Whenever a process
wants to attain UDC of action α, it goes into a special UDC(α) state. If a process p
is in a UDC(α) state, it sends an α-message repeatedly to all other processes (telling
them to perform α). Process p performs α if it is in a UDC(α) state and if, for every
process q, p receives an acknowledgment from q to its α-message or p's failure detector
says or has said that q is faulty. However, p continues to send α-messages (even after
performing α) to all processes from which it has not received an acknowledgment until it
has received an acknowledgment from all processes (which may never happen).¹¹ Every
time a process q receives an α-message from p, q sends an acknowledgment to p; it also
goes into a UDC(α) state if it has not already done so.

To show that this protocol attains UDC, it suffices to show that, in every run, (1) if
a process p is in a UDC(α) state, then p will eventually perform α or crash and (2) if p
performs α then every correct process performs α. To see that (1) holds, suppose that p
is in a UDC(α) state in run r and does not crash. Suppose, by way of contradiction, that
p does not perform α in run r. That means that there must be some process q such that
p's failure detector never reports q as faulty and p does not receive an acknowledgment
from q. Since p has a strong failure detector, if q is faulty, then at some point in r, p
must receive a report to this effect from its failure detector. Thus, q must be correct in r.
Since p repeatedly sends an α-message to q, by R5, q must receive the message infinitely
often. That means it sends an acknowledgment back to p infinitely often. By R5 again,
p must receive the acknowledgment, contradicting the assumption that it does not.

10 This protocol, like most of the others we present in this paper, does not have any mechanism for
termination. Processes keep sending messages forever. Since message communication is unreliable, it is
not hard to show that there is in fact no protocol that attains nUDC and terminates. We can deal with
this problem by adding a heartbeat mechanism [ACT97], but this issue is beyond the scope of this paper.

11 If p has a strongly accurate failure detector rather than just a weakly accurate failure detector, it
can actually stop sending messages after performing α. This follows from the proof of Proposition 3.1.

To see that (2) holds, first note that it holds vacuously if there are no processes correct 
in r. If there is some process that is correct in r, then since p has a weakly accurate 
failure detector, there is some correct process, say q*, that p never suspects. Thus, if p 
performs a, it must receive an acknowledgment from q* to its a-message. Hence, q* goes 
into a UDC(a) state and never crashes, so (1) implies that it also performs a. Since q* 
is correct, all correct processes eventually receive an a-message from q*, go into a 
UDC(a) state, and so, by (1), perform a. ∎

Proposition 3.4: If $\mathcal{R}$ satisfies A1 and A5$_{n-1}$, then $\mathcal{R}$ satisfies weak accuracy iff 
$\mathcal{R}$ satisfies strong accuracy. 

Proof: It suffices to show that weak accuracy implies strong accuracy, since the converse is 
immediate from the definitions. Let $\mathcal{R}$ satisfy A1, A5$_{n-1}$, and weak accuracy. If $\mathcal{R}$ does 
not satisfy strong accuracy, then there is a point $(r,m)$ and processes p, q such that 
$q \in \mathit{Suspects}_p(r,m)$ and q has not failed in r. Let $S' = \mathit{Proc} - \{q\}$. By A5$_{n-1}$, there is a 
run $r'$ where all the processes in $S'$ fail. Thus, by A1, there is a run $r''$ extending $(r,m)$ 
such that all the processes in $S'$ fail in $r''$. It follows that q is the only correct process in 
$r''$. By weak accuracy, we must have that q is never suspected as faulty in $r''$, contradicting 
the assumption that q is in fact suspected by p at $(r,m)$. ∎

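Proposition 3.5: If $\mathcal{R}$ satisfies A1, A2, and A4, then 

$$\bigwedge_{a \in A_{p'}} \Big[ K_p\Big(\mathit{init}_{p'}(a) \wedge \bigwedge_{q \in \mathit{Proc}} \Diamond\big(K_q\,\mathit{init}_{p'}(a) \vee \mathit{crash}(q)\big)\Big) \Rightarrow K_p\Big(\bigvee_{q \in \mathit{Proc}} \Box\neg\mathit{crash}(q) \Rightarrow \bigvee_{q \in \mathit{Proc}} \big(K_q\,\mathit{init}_{p'}(a) \wedge \Box\neg\mathit{crash}(q)\big)\Big) \Big]$$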

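Proof: Suppose, by way of contradiction, that for some $p, p' \in \mathit{Proc}$ and $a \in A_{p'}$, we 
have that 

$$(\mathcal{R},r,m) \models K_p\Big(\mathit{init}_{p'}(a) \wedge \bigwedge_{q \in \mathit{Proc}} \Diamond\big(K_q\,\mathit{init}_{p'}(a) \vee \mathit{crash}(q)\big)\Big) \wedge \neg K_p\Big(\bigvee_{q \in \mathit{Proc}} \Box\neg\mathit{crash}(q) \Rightarrow \bigvee_{q \in \mathit{Proc}} \big(K_q\,\mathit{init}_{p'}(a) \wedge \Box\neg\mathit{crash}(q)\big)\Big). \tag{1}$$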

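Then there must be a point $(r^1, m') \sim_p (r, m)$ such that 

$$(\mathcal{R}, r^1, m') \models \mathit{init}_{p'}(a) \wedge \bigvee_{q \in \mathit{Proc}} \Box\neg\mathit{crash}(q) \wedge \bigwedge_{q \in \mathit{Proc}} \big(\Box\neg\mathit{crash}(q) \Rightarrow \neg K_q\,\mathit{init}_{p'}(a)\big).$$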
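
As a worked step, the last two conjuncts of the formula just displayed are exactly what results from negating the implication that p fails to know; the first conjunct carries over because p knows it at $(r,m)$. Writing $\varphi$ for $\mathit{init}_{p'}(a)$ (an abbreviation used here only for brevity) and letting $q$ range over $\mathit{Proc}$:

```latex
\begin{align*}
\neg\Big(\bigvee_{q}\Box\neg\mathit{crash}(q) \Rightarrow
      \bigvee_{q}\big(K_q\varphi \wedge \Box\neg\mathit{crash}(q)\big)\Big)
 &\equiv \bigvee_{q}\Box\neg\mathit{crash}(q) \;\wedge\;
      \neg\bigvee_{q}\big(K_q\varphi \wedge \Box\neg\mathit{crash}(q)\big) \\
 &\equiv \bigvee_{q}\Box\neg\mathit{crash}(q) \;\wedge\;
      \bigwedge_{q}\big(\neg K_q\varphi \vee \neg\Box\neg\mathit{crash}(q)\big) \\
 &\equiv \bigvee_{q}\Box\neg\mathit{crash}(q) \;\wedge\;
      \bigwedge_{q}\big(\Box\neg\mathit{crash}(q) \Rightarrow \neg K_q\varphi\big)
\end{align*}
```

In words: some process never crashes, and no process that never crashes knows that p' initiated a.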

We have $(\mathcal{R}, r^1, m') \models \bigwedge_{q \notin F(r^1)} \neg K_q\,\mathit{init}_{p'}(a)$. Since p' knows that it initiated a at 
$(r^1, m')$, we must have $p' \in F(r^1)$. Moreover, $F(r^1) \neq \mathit{Proc}$, because 
$(\mathcal{R}, r^1, m') \models \bigvee_{q \in \mathit{Proc}} \Box\neg\mathit{crash}(q)$. 

Let $S = \mathit{Proc} - F(r^1)$. By A4 with $\varphi =_{\mathrm{def}} \mathit{init}_{p'}(a)$, there exists a point $(r^2, m')$ such 
that $(r^2, m') \sim_q (r^1, m')$ for $q \in S$ and $(\mathcal{R}, r^2, m') \models \neg\mathit{init}_{p'}(a)$. Since 
$(r^2, m') \sim_q (r^1, m')$ for all $q \in S$, it follows that no process in S has crashed by $(r^2, m')$. By 
A1, there exists a run $r^3$ extending $(r^2, m')$ such that $F(r^3) = F(r^1)$. Since $r^3$ extends 
$(r^2, m')$, we must have $r^1_q(m') = r^3_q(m')$ for all $q \in S$. By A2, there exist runs $r^4$ and $r^5$ 
extending $r^1$ and $r^3$, respectively, such that $r^4_q(m'') = r^5_q(m'')$ for $m'' \ge m'$. Moreover, all 
the processes in $F(r^1)$ (and, in particular, p') crash by time $m'+1$ in $r^4$ and $r^5$. Thus, the 
event $\mathit{init}_{p'}(a)$ does not appear in $r^5$, which means that $(\mathcal{R}, r^5, m') \models \bigwedge_{q \in S} \Box\neg K_q\,\mathit{init}_{p'}(a)$. 
Since $r^4$ and $r^5$ are indistinguishable to such q from $m'$ onward, we have 
$(\mathcal{R}, r^4, m') \models \bigwedge_{q \in S} \Box\neg K_q\,\mathit{init}_{p'}(a)$. Since $(r, m) \sim_p (r^1, m')$ and $r^4$ extends $(r^1, m')$, we 
must have $(r, m) \sim_p (r^4, m')$. Hence, we have 
$(\mathcal{R}, r, m) \models \neg K_p\big(\Diamond(K_q\,\mathit{init}_{p'}(a) \vee \mathit{crash}(q))\big)$ for $q \in S$. 
This gives the desired contradiction to (1). ∎

Theorem 3.6: Suppose $\mathcal{R}$ is the system generated by a protocol that attains UDC, $\mathcal{R}$ 
satisfies A1-A4 and A5$_{n-1}$, and for each run $r \in \mathcal{R}$, if $F(r) \neq \mathit{Proc}$, then infinitely many 
actions are initiated in r (i.e., infinitely many events of the form $\mathit{init}_p(a)$ appear in r). 
Then the system $\mathcal{R}^f$ has perfect failure detectors. 

Proof: It is immediate from the construction that p crashes in $(r, m)$ iff p crashes in 
$(f(r), 2m)$. It easily follows that p's failure detector satisfies strong accuracy. To show 
that it satisfies strong completeness, suppose that p is correct and q fails in run $f(r) \in \mathcal{R}^f$ 
and hence also in run $r \in \mathcal{R}$. Since infinitely many actions are initiated in r (and hence 
in f(r)), there must be some action a initiated by some correct process, say p', in f(r) after 
q has failed. Since $\mathcal{R}$ satisfies UDC, by DC1 and DC2, p must eventually perform a 
in run r, say at time m. Moreover, by DC2, p knows that, for each process q' (and, in 
particular, q), q' must eventually either crash or perform a. Using DC3, it easily 
follows that we must have 

$$(\mathcal{R},r,m) \models K_p\Big(\mathit{init}_{p'}(a) \wedge \bigwedge_{q' \in \mathit{Proc}} \Diamond\big(K_{q'}\,\mathit{init}_{p'}(a) \vee \mathit{crash}(q')\big)\Big).$$

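Since $\mathcal{R}$ satisfies A1, A2, and A4 by assumption, it follows from Proposition 3.5 that 

$$(\mathcal{R},r,m) \models K_p\Big(\bigvee_{q' \in \mathit{Proc}} \Box\neg\mathit{crash}(q') \Rightarrow \bigvee_{q' \in \mathit{Proc}} \big(K_{q'}\,\mathit{init}_{p'}(a) \wedge \Box\neg\mathit{crash}(q')\big)\Big). \tag{2}$$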

By way of contradiction, suppose that $(\mathcal{R},r,m) \models \Box\neg K_p\,\mathit{crash}(q)$. Since q crashes in 
r before p' initiates a, it is easy to show that $(\mathcal{R},r,m) \models \neg K_q\,\mathit{init}_{p'}(a)$. Thus, there must 
exist a point $(r^1,m') \sim_p (r,m)$ such that $(\mathcal{R},r^1,m') \models \neg\mathit{crash}(q) \wedge \neg K_p K_q\,\mathit{init}_{p'}(a)$. Since 
$K_q\,\mathit{init}_{p'}(a)$ is stable, local to q, and (by A3) insensitive to failures by q, by A4, there must 
exist a point $(r^2,m') \sim_p (r^1,m')$ such that $r^2_q(m')$ is a prefix of $r^1_q(m')$ and 
$(\mathcal{R},r^2,m') \models \neg K_q\,\mathit{init}_{p'}(a)$. Since $r^2_q(m')$ is a prefix of $r^1_q(m')$, it is easy to show that 
$(\mathcal{R},r^2,m') \models \neg\mathit{crash}(q)$. Thus, $(r^2,m') \sim_p (r,m)$ and 
$(\mathcal{R},r^2,m') \models \neg\mathit{crash}(q) \wedge \neg K_q\,\mathit{init}_{p'}(a)$. 

By A5$_{n-1}$ and A1, there is a run $r^3$ extending $(r^2,m')$ such that all processes except 
q fail in $r^3$. We have $(r^3,m') \sim_p (r,m)$ and 

$$(\mathcal{R},r^3,m') \models \Box\neg\mathit{crash}(q) \wedge \neg K_q\,\mathit{init}_{p'}(a) \wedge \bigwedge_{q' \in \mathit{Proc} - \{q\}} \Diamond\mathit{crash}(q').$$

Since $(r,m) \sim_p (r^2,m') \sim_p (r^3,m')$, this contradicts (2). ∎

Proposition 4.1: There is a protocol that attains UDC in a context with a bound of t 
on the number of failures and with t-useful generalized failure detectors. 

Proof: To attain UDC of action a, a process goes into a special UDC(a) state. If a 
process p is in a UDC(a) state, it sends an a-message repeatedly to all other processes 
from which it has not received an acknowledgment, telling them to perform a. Process 
p performs a at time m if, by time m, there is a set $S \subseteq \mathit{Proc}$ and $k \le |S|$ such 
that (a) it is in a UDC(a) state, (b) its failure detector has reported $\mathit{suspect}_p(S, k)$, (c) 
it has received messages from all the processes in $\mathit{Proc} - S$ acknowledging a, and (d) 
$n - |S| > \min(t, n-1) - k$. Process p continues to send a-messages to each $q \in S$ until it 
either receives an acknowledgment from q or knows q to be faulty. (Note that knowledge 
is only necessary for the protocol's termination.) A process that receives an a-message 
from p sends an acknowledgment to p and goes into a UDC(a) state if it has not already 
done so. 
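
Conditions (a) through (d) can be written as a small predicate. The following is a minimal sketch, not the paper's definition: the function name `may_perform_bounded` and its argument layout are hypothetical, a report `(S, k)` stands for the event $\mathit{suspect}_p(S, k)$, and p is assumed to count itself among the processes that have acknowledged.

```python
def may_perform_bounded(in_udc, reports, acked, procs, t):
    """Return True if some reported pair (S, k) satisfies conditions (a)-(d).

    in_udc  -- whether p is in the UDC(a) state
    reports -- iterable of (S, k) pairs reported so far (S a frozenset of processes)
    acked   -- set of processes that have acknowledged a (assumed to include p itself)
    procs   -- frozenset of all n processes
    t       -- bound on the number of failures
    """
    n = len(procs)
    if not in_udc:                                   # (a)
        return False
    for S, k in reports:                             # (b): the pair has been reported
        if not (S <= procs and k <= len(S)):
            continue                                 # ignore ill-formed reports
        if not (procs - S) <= acked:                 # (c): everyone outside S acknowledged
            continue
        if n - len(S) > min(t, n - 1) - k:           # (d)
            return True
    return False


procs = frozenset(range(5))
# At most t = 2 failures; the detector reports that both 3 and 4 are faulty.
print(may_perform_bounded(True, [(frozenset({3, 4}), 2)], {0, 1, 2}, procs, 2))
# Same report, but process 2 has not acknowledged yet.
print(may_perform_bounded(True, [(frozenset({3, 4}), 2)], {0, 1}, procs, 2))
```

The first call satisfies all four conditions; the second fails condition (c) because process 2, which lies outside S, has not acknowledged.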

To show that this protocol attains UDC, again it suffices to show that, in every run, 
(1) if a process p is in a UDC(a) state, then p will eventually perform a or crash and (2) 
if p performs a then every other correct process performs a. For (1), suppose that p is in 
a UDC(a) state in run r and, by way of contradiction, p neither performs a nor crashes. 
Then p repeatedly sends an a-message in r to every process q. By R5, every correct 
process q will get it infinitely often. Since q acknowledges p's a-message each time it gets 
it, by R5, p will eventually get an acknowledgment from every correct process. Since p 
has a t-useful failure detector, if it is correct in r, there will be a t-useful failure-detector 
event, say $\mathit{suspect}_p(S, k)$, in $r_p(m)$ for some m. Since p eventually gets acknowledgments 
from all the processes in $\mathit{Proc} - S$ (since these, at least, are correct in r), it will eventually 
perform a, according to the algorithm. Thus, (1) holds. 

To see that (2) holds, the arguments for (1) show that if p performs a as a result 
of the failure-detector event $\mathit{suspect}_p(S, k)$, all the processes in $\mathit{Proc} - S$ have received 
an a-message (and hence are in a UDC(a) state) and $\mathit{Proc} - S$ contains at least one 
correct process, say q, if there are any correct processes in r. Since q continues to send 
a-messages to all processes from which it has not received an acknowledgment, all the 
correct processes in r will eventually be in a UDC(a) state. It then follows from (1) that 
all the correct processes will perform a. ∎
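
The claim that $\mathit{Proc} - S$ contains a correct process whenever one exists rests on a counting argument behind condition (d). The brute-force check below is a sketch under an assumed reading of the failure-detector event, namely that $\mathit{suspect}_p(S, k)$ is accurate when at least k processes in S are faulty in the run; the function names are hypothetical. It enumerates small systems and confirms that, when at most t processes fail, not all processes fail, and (d) holds, some process outside S is correct.

```python
from itertools import combinations


def condition_d(n, t, size_s, k):
    """Condition (d) of the protocol: n - |S| > min(t, n - 1) - k."""
    return n - size_s > min(t, n - 1) - k


def correct_process_outside_s_always_exists(n_max=5):
    """Exhaustively check the counting argument for all systems with up to n_max processes."""
    for n in range(1, n_max + 1):
        procs = frozenset(range(n))
        subsets = [frozenset(c) for size in range(n + 1)
                   for c in combinations(procs, size)]
        for t in range(n + 1):
            for faulty in subsets:
                # Respect the bound t and require at least one correct process.
                if len(faulty) > t or faulty == procs:
                    continue
                for s in subsets:
                    # Accurate reports only (assumption): k <= number of faulty processes in S.
                    for k in range(len(faulty & s) + 1):
                        if condition_d(n, t, len(s), k) and not (procs - s) - faulty:
                            return False  # counterexample: nobody outside S is correct
    return True


print(correct_process_outside_s_always_exists())
```

The reason no counterexample exists is that at most $\min(t, n-1) - k$ faulty processes can lie outside S, which by (d) is fewer than the $n - |S|$ processes there.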

Theorem 4.3: Suppose $\mathcal{R}$ is the system generated by a protocol that attains UDC in 
a context with at most t failures, $\mathcal{R}$ satisfies A1-A4 and A5$_t$, and for each run $r \in \mathcal{R}$, 
if $F(r) \neq \mathit{Proc}$, then infinitely many actions are initiated in r. Then $\mathcal{R}^{f'}$ has t-useful 
generalized failure detectors. 

Proof: Again, it is easy to see that each correct process p's failure detector satisfies 
generalized strong accuracy. To show that it satisfies generalized impermanent strong 
completeness, suppose that p is correct. Since infinitely many actions are initiated in r 
(and hence also in f(r)), there must be some action a initiated by a correct process p' 
in f(r) at a time after all the processes in F(r) (= F(f(r))) have failed in f(r). 

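Since $\mathcal{R}$ satisfies UDC, p must eventually perform a in run r, say at time m. As in 
the proof of Theorem 3.6, using Proposition 3.5, we can conclude that 

$$(\mathcal{R},r,m) \models K_p\Big(\bigvee_{q' \in \mathit{Proc}} \Box\neg\mathit{crash}(q') \Rightarrow \bigvee_{q' \in \mathit{Proc}} \big(K_{q'}\,\mathit{init}_{p'}(a) \wedge \Box\neg\mathit{crash}(q')\big)\Big). \tag{3}$$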

Suppose, by way of contradiction, that p does not know that at least $k = |F(r)| - n + t$ 
processes have crashed at $(r, m)$. Then there must be a point $(r^1, m') \sim_p (r, m)$ such that 
$k' < k$ processes have crashed by $(r^1, m')$. We must have $(\mathcal{R}, r, m) \models \bigwedge_{q \in F(r)} \neg K_q\,\mathit{init}_{p'}(a)$, 
since all the processes in F(r) crashed in r before p' initiated a. Consequently, it follows 
that $(\mathcal{R}, r^1, m') \models \bigwedge_{q \in F(r)} \neg K_p(K_q\,\mathit{init}_{p'}(a))$. By repeated applications of A4, there is a 
point $(r^2, m') \sim_p (r^1, m')$ such that $(\mathcal{R}, r^2, m') \models \bigwedge_{q \in F(r)} \neg K_q\,\mathit{init}_{p'}(a)$. (We are using 
the fact that $K_q\,\mathit{init}_{p'}(a)$ is stable, local to q, and insensitive to failures by q. Thus, if 
$\neg K_q\,\mathit{init}_{p'}(a)$ holds for some history of q, it holds for any prefix of that history or a prefix 
followed by a $\mathit{crash}_q$ event.) Thus, the only processes that may know $\mathit{init}_{p'}(a)$ in r are 
those in $\mathit{Proc} - F(r)$. Since $|F(r)| = n - t + k$, we have that $|\mathit{Proc} - F(r)| = t - k$. Thus, 
at most $t - k$ processes know $\mathit{init}_{p'}(a)$ at the point $(r^2, m')$. 

Let $F_1$ be the set of processes that have crashed by $(r^1, m')$ and let $F_2$ be the set of 
processes that have crashed by $(r^2, m')$. Since $r^2_q(m')$ is a prefix of $r^1_q(m')$ for all $q \in \mathit{Proc}$, 
we must have $F_2 \subseteq F_1$. Recall that $|F_1| = k' < k$. We now proceed much as in the 
proof of Proposition 3.5 to construct a run extending $(r^2, m')$ in which the processes in 
$F(r) - F_1$ do not crash and do not learn about $\mathit{init}_{p'}(a)$. 

By A4, there exists a point $(r^3, m')$ such that $(r^3, m') \sim_q (r^2, m')$ for $q \in F(r)$ and 
$(\mathcal{R}, r^3, m') \models \neg\mathit{init}_{p'}(a)$. As in the previous application of A4, the set of processes that are 
faulty at the point $(r^3, m')$ is a subset of $F_1$ and hence consists of at most $k'$ processes. By 
A1 and A5$_t$, there exists a run $r^4$ extending $(r^3, m')$ such that $F(r^4) = (\mathit{Proc} - F(r)) \cup F_1$. 
Since $r^4$ extends $(r^3, m')$, we must have $r^2_q(m') = r^4_q(m')$ for all $q \in F(r)$. By A2, there 
exist runs $r^5$ and $r^6$ extending $r^2$ and $r^4$, respectively, such that $r^5_q(m'') = r^6_q(m'')$ for 
$m'' \ge m'$. Moreover, all the processes in $(\mathit{Proc} - F(r)) \cup F_1$ crash by time $m'+1$. Clearly 
$p' \notin F(r)$ (since all the processes in F(r) crash before p' initiates a). Thus, the event $\mathit{init}_{p'}(a)$ 
does not appear in $r^6$. It follows that $(\mathcal{R}, r^5, m'') \models \bigwedge_{q \in F(r)} \neg K_q\,\mathit{init}_{p'}(a)$ for all $m'' \ge m'$, 
so $(\mathcal{R}, r^5, m') \models \bigwedge_{q \in F(r)} \Box\neg K_q\,\mathit{init}_{p'}(a) \wedge \bigwedge_{q \in (\mathit{Proc} - F(r)) \cup F_1} \Diamond\mathit{crash}(q)$. Since 
$(r, m) \sim_p (r^1, m')$, $(r^1, m') \sim_p (r^2, m')$, and $r^5$ extends $(r^2, m')$, we must have 
$(r, m) \sim_p (r^5, m')$. But this gives us the desired contradiction to (3). 

Thus, p must know about at least k failures at the point $(r, m)$. Let S be any set 
containing F(r). Our transformation from $\mathcal{R}$ to $\mathcal{R}^f$ guarantees that eventually there 
will be a failure-detector event (S, k) in p's history, and this is a t-useful event. ∎